INTRODUCTION


The goal of designing computer vision algorithms for image recognition systems,
broadly  stated,  is  to  produce  automatic  tools  to  acquire,  manipulate,  understand,  and 
process  imagery  information,  and  through  this,  to  identify  the  objects  that  generate  such 
images. This goal is made difficult to achieve not only by the large amount of information
contained in an image and the variability under which imagery data can be
produced,  but  also  by  the  number  of  different  aspects  that  the  same  object  can  have  in 
different  images  (ref  1).  In  an  effort  to  understand  these  problems,  researchers  have 
isolated  the  details  of  the  behavior  of  the  particular  components  of  an  image  recognition 
system,  and  studied  them  under  controlled  conditions.  This  has  led  to  a  number  of 
image  models,  and  a  greater  understanding  of  the  physics  of  images.  Another  outcome 
of  this  effort  was  a  large  number  of  special  purpose  algorithms,  capable  of  handling  a 
limited  array  of  image  variations. 

These stand-alone components introduced system integration problems; the components
were designed with no knowledge about each other, and with expectations about the
preceding and succeeding stages of the system which were not always satisfied.
Therefore, with robustness as a paramount issue, the goal of total system integration
has been an elusive objective for the past 20 years, and it has become very clear that
the components for an integrated system must adhere to standards of performance and
robustness which make them computationally impractical. Not only do they require a
large amount of hardware (both memory and computing power), but as the number of
variations the system must handle is increased, so is the serial nature of the
algorithms, removing the hope of fully parallel computation.

It  was  also  clear  from  the  lessons  of  Nature,  that  this  trend  was  running  contrary  to 
the  underlying  theme  seen  in  natural  vision  systems.  The  natural  design,  exposed  by 
more  recent  neurophysiological  research,  is  formed  by  simple,  parallel,  cooperating- 
competing systems. The natural design is also carefully harmonized with the environment
of the animal and the purpose for which visual information is to be used. With the
advent of more mature research into the architecture and the behavioral aspects of
visual nervous systems, an entirely new philosophy of image recognition system design
could be embraced. This philosophy concentrated on the design of simple and massively
parallel structures, with a simultaneous application of a set of mutually satisfying
constraints over the entire image (ref 2). However, it was noticed that each individual
algorithm resulting from this new philosophy did not perform as well as its more
complex classical counterpart; not all image variations could be taken into
consideration. To circumvent this problem, some assumptions needed to be made about the
image. This was the same shortcoming present in the classical case. Nevertheless, the
new design offered an unprecedented opportunity to overcome it; the highly
decentralized and parallel algorithms permitted mutual interaction and tuning. This
allowed pursuit of designs that contain competing and cooperating algorithms; when an
algorithm did not perform well under certain types of image variations, another was
used (ref 3).
The information needed to hypothesize the properties of the image data could now
be  taken  from  the  output  of  higher  level  components  of  the  system.  This  permitted  the 
initial  estimates  constantly  to  be  refined  throughout  the  processing  time  (resonance). 
The  entire  system  became  tightly  connected  and  robust,  although  it  still  maintained 
basic  simplicity  and  a  high  degree  of  parallelism. 


DESIGN  PROCEDURE  FOR  RESONATING  ALGORITHMS 

No  matter  how  complex  the  algorithm,  it  can  never  be  designed  to  successfully 
account  for  all  possible  aspect  variations  that  an  object  can  produce  in  real  images. 
The  more  complete  the  algorithm,  the  more  serial  and  computationally  expensive  it 
becomes. With complex imagery, there will always be cases that have not been
considered. Since it is impossible to consider all cases beforehand, an alternative
methodology is proposed that employs simple and robust parallel algorithms that are
made to interact according to a set of parallel multiple constraints (fig. 1).


Figure 1. Cost and parallelism variation versus algorithm completeness

These algorithms are designed by stating the list of physical phenomena that give
rise to important characteristics of the image and at the same time constrain the
possible differences in object appearance. For example, if the object of interest is
white, flies in near random patterns, and usually will be seen at night, then the
algorithm should concentrate on detecting a bright pixel grouping which represents a
discontinuity in the dark background in both time and space. Such constraints, once
applied over the entire image, should cause automatic disregard of large fixed white
backgrounds, speck-size white patterns (e.g., sensor noise), and any nonwhite region.
A simple way to satisfy these constraints in parallel is to have every pixel check its
value and the values of its neighbors in time and space. If the pixel finds itself with
a white measurement surrounded by a dark background, it labels itself as background and
discards the white measurement as noise. If it has a white neighborhood but it is fixed
in time, it also discards the measurement, labeling itself as background. Only when it
finds discontinuity in time with a consistent white neighborhood in space does it take
the label of an object of interest. The hypothesis created at this stage will be
confirmed by later stages of processing. This type of constraint satisfaction problem
has been demonstrated to be equivalent to resistive or simple nonlinear, uniform analog
circuits.
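
The per-pixel rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the brightness threshold `WHITE`, the requirement of at least three white spatial neighbors, and the use of a single previous frame to test for "fixed in time" are hypothetical choices, not values given in this report.

```python
WHITE = 200  # hypothetical brightness threshold for a "white" measurement


def label_pixels(prev, curr, white=WHITE, min_white_neighbors=3):
    """Label each pixel 1 (object of interest) or 0 (background).

    prev and curr are two consecutive frames given as lists of rows.
    Every pixel applies the same local test, so the loop could run in parallel.
    """
    rows, cols = len(curr), len(curr[0])
    labels = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if curr[i][j] < white:
                continue  # nonwhite region: background
            white_neighbors = 0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    r, c = i + di, j + dj
                    if (di or dj) and 0 <= r < rows and 0 <= c < cols:
                        if curr[r][c] >= white:
                            white_neighbors += 1
            if white_neighbors < min_white_neighbors:
                continue  # white speck on a dark background: noise
            if prev[i][j] >= white:
                continue  # white but fixed in time: background
            labels[i][j] = 1  # discontinuity in time, consistent in space
    return labels


# Small scene: a fixed white patch, a one-pixel speck, and a newly arrived blob.
prev_frame = [[10] * 5 for _ in range(5)]
curr_frame = [[10] * 5 for _ in range(5)]
for r, c in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    prev_frame[r][c] = curr_frame[r][c] = 255  # fixed white background patch
curr_frame[0][4] = 255                         # isolated speck (sensor noise)
for r, c in [(3, 3), (3, 4), (4, 3), (4, 4)]:
    curr_frame[r][c] = 255                     # new bright blob
object_labels = label_pixels(prev_frame, curr_frame)
```

Only the four pixels of the newly arrived blob keep the object label; the fixed patch and the speck are rejected by the temporal and spatial tests, respectively.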

The equations describing such a process are noncausal in both space and time. This
suggests that a single pixel cannot definitively label itself until all other pixels
are labeled in the image. To solve this problem, processing is done in stages.
Initially, the pixels take an educated guess at their labels; then, competition allows
each pixel in parallel to look at its neighborhood and adjust its value. This process
corresponds to a descent in the energy space of the system, settling into a compromise
among all hypotheses (a good one, but not necessarily the optimal one). The final
interpretation is said to resonate in the system, and the values of the pixel
classification are said to be at a fixed point (this is the first type of resonance
described) (fig. 2).
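
A minimal sketch of this staged process follows. It assumes binary labels and a hypothetical update rule in which a pixel switches its label when at least six of its eight neighbors disagree with it. The report does not specify the rule, and a synchronous update such as this one is not guaranteed to lower an energy function at every sweep, hence the sweep limit.

```python
def relax(labels, min_disagree=6, max_sweeps=50):
    """Synchronously update binary labels until no pixel changes (a fixed point)."""
    rows, cols = len(labels), len(labels[0])
    current = [row[:] for row in labels]
    for _ in range(max_sweeps):
        nxt = [row[:] for row in current]
        changed = False
        for i in range(rows):
            for j in range(cols):
                disagree = 0
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        r, c = i + di, j + dj
                        if (di or dj) and 0 <= r < rows and 0 <= c < cols:
                            if current[r][c] != current[i][j]:
                                disagree += 1
                if disagree >= min_disagree:
                    nxt[i][j] = 1 - current[i][j]  # give in to the neighborhood
                    changed = True
        if not changed:
            return current  # fixed point reached
        current = nxt
    return current


# Educated first guess: a 5 x 5 object with a hole, plus one isolated false alarm.
guess = [[0] * 9 for _ in range(9)]
for i in range(1, 6):
    for j in range(1, 6):
        guess[i][j] = 1
guess[3][3] = 0  # hole inside the object
guess[7][7] = 1  # isolated false alarm
fixed = relax(guess)
```

In this example the hole is filled and the false alarm is removed in the first sweep, and the second sweep changes nothing, so the labels are at a fixed point.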

Notice that the design of such an algorithm could have benefited from another
algorithm carefully tuned to discern movement only within a certain range of speeds.
This algorithm could pick up objects of interest by analyzing motion, and could be used
to predict the future object position since it retains information about the
trajectory. It would not be able to discern among objects of different colors, in the
same way that the previous algorithm could not discern among slower or faster moving
objects.


Figure  2.  Intralevel  resonance  between  algorithms 

By combining the output of both algorithms, a better interpretation of the image
could be achieved. In this particular case, after the internal resonance process, the
algorithms would exchange and mutually reinforce areas of interest, based on how much
they satisfy the constraints. This would allow the pixels another chance to review
their labels based on additional information. This process is repeated until another
resonant interpretation is found between the algorithms. Notice the much slower time
scale of this second process. Both algorithms still retain the qualities of
parallelism, fault tolerance, and simplicity.

A third level of processing is added to increase the level of confidence in the
interpretation. The previous resonance pointed out the areas of interest in the image
and produced a large reduction in the data set size. However, nothing was said about
how bright algorithm 1 should expect the object to be, or about how fast algorithm 2
should expect the object to move. Notice that if the interpretation is correct, the
levels of confidence of both will increase, and the interpretation is said to be
"locked" (they are mutually reinforced). If it is not, the process decreases the
initial level of confidence, and if repeated for a [illegible], the interpretation
confidence will drop below acceptable limits, and the region will be discarded as
noise. This third type of resonance happens on a much slower time scale than the
others, and it is responsible for removing ambiguities (fig. 3).
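
The locking behavior can be illustrated with a toy model. Everything specific here is an assumption made for illustration: the two confidences are numbers in [0, 1], each algorithm raises or lowers its confidence in proportion to how far the other's confidence is above or below 0.5, and the `accept` and `reject` limits are hypothetical.

```python
def resonate(conf_1, conf_2, rate=0.5, max_cycles=100):
    """Let two algorithms exchange confidences for one region until they settle."""
    for _ in range(max_cycles):
        # Each algorithm is reinforced (or weakened) by the other's confidence.
        new_1 = min(1.0, max(0.0, conf_1 + rate * (conf_2 - 0.5)))
        new_2 = min(1.0, max(0.0, conf_2 + rate * (conf_1 - 0.5)))
        if (new_1, new_2) == (conf_1, conf_2):
            break  # nothing changes any more: a resonant state
        conf_1, conf_2 = new_1, new_2
    return conf_1, conf_2


def interpret(conf_1, conf_2, accept=0.9, reject=0.1):
    """Classify a region from the settled confidences."""
    c1, c2 = resonate(conf_1, conf_2)
    if min(c1, c2) >= accept:
        return "locked"
    if max(c1, c2) <= reject:
        return "noise"
    return "ambiguous"


agreeing = interpret(0.7, 0.6)     # both algorithms mildly support the region
conflicting = interpret(0.7, 0.2)  # the second algorithm strongly disagrees
```

With these numbers the first region is driven to full confidence and locked, while the second loses support at every exchange and is discarded as noise.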


The third level can be used to finalize the interpretation by imposing additional
"context" constraints on the previous results. Assume, for example, that the object of
interest has specific patterns on its back and it flies only in the forward direction.
The presence of high level features can be used to confirm the object identification.
A companion algorithm could take into account new high level features such as the
position of fixed light sources, and adjust the expected brightness value in
algorithm 1. If this interpretation is fed back to the lower levels, resonance of the
third kind described occurs, further enhancing the interpretation.


Level  3 

The advantages of separate modules as opposed to a monolithic design are simplicity,
adaptability, parallelism, fault tolerance, and robustness. The first benefits are
rather intuitive, and robustness is readily illustrated. If the context is such that
one of the modules is rendered useless, there is enough support from other constraints
to make the correct interpretation (e.g., an object flying a straight collision course
could render the speed detector useless but not the brightness detector). This kind of
degenerate situation is common in image interpretation systems, and is the cause of the
poor performance of integrated systems. The types of highly parallel algorithms with
simple interconnecting elements described here are known in the literature as
connectionist algorithms or neural networks (refs 4 and 5).


SPECIAL  PURPOSE  FEATURE  BASED  SYSTEM 

To  understand  these  concepts  further  and  to  study  the  stability  and  effectiveness  of 
the  resonant  mechanisms  in  a  shape  recognition  problem,  a  simple  but  very  useful 
simulation based on Fourier and moment descriptors was constructed. To decrease the
computational load, the system was broken down into levels, each being responsible for
abstracting the data sent to another level. The system was designed to cope with a
large number of sensed images which differ both in spectra and in time. The system
should be able to recognize objects of different classes, multiple occurrences of the
same class, and multiple perspectives, and it should handle large amounts of noise as
well as shape occlusion.

The system consists of three levels: a segmentation level, a feature extraction
level, and an object classification level (fig. 4). The segmentation level, although
general, can enhance its performance with the knowledge obtained about object
identification, position, and orientation. Object identification information is
provided by the top level, which works as a heteroassociative bidirectional memory
(ref 6). The second level receives the segmented image and proceeds to extract
features. It is also capable of receiving the correct feature set, and inverting it to
produce a model image that can then be adjusted to match the input image. The top
level, after receiving a feature vector with noise, produces a nearest neighbor
classification of it, and as the output of the associative memory, gives information to
the first level about the classification for parametric tuning. Since it is a
bidirectional memory, it alters the feature vector to match the classifier
interpretation; this, in turn, forces the second level to correct pixels misclassified
by the first level (i.e., occluded pixels). This entire process works on the resonance
principles described above (fig. 4).


Figure 4. Diagram of the shape recognition system (blocks: segmentation, features, BAM)

Segmentation  Algorithm 

The segmentation algorithm is responsible for filtering out possible objects in the
image for final interpretation. To proceed in our design, let's define a set of
physical constraints to construct the segmentation algorithm at the first level. For
the sake of simplicity in our discussion, only the following two assumptions about the
objects were made:

1. Homogeneity: every object of interest is made of the same material and,
therefore, should produce similar visual results.

2. Continuity: every object of interest is, for the most part, solid and,
therefore, [illegible] in closed neighborhoods.

These two assumptions are fairly general, and they hold in a variety of situations,
[illegible] the boundary pixels of our objects, as will be seen in the example. The
mathematical model that has given rise to the study of spatially dependent phenomena in
regular neighborhoods is called Markov Random Fields, and it will be used to model the
continuity assumption. If a certain range of brightness values is accepted as
representative of the material characteristics, a likelihood function is a good
candidate for the first (refs [illegible] through 11).


Problem  Formulation 


The problem of segmentation is defined as grouping parts of a generalized image
into regions, homogeneous with respect to some characteristics or features and
resulting in a partitioned image (refs 9, 10, and 12 through 14).

Define a picture $F = f(x,y)$ as a two-dimensional intensity function $f(x,y)$. The
quantized version of $f(x,y)$ in both spatial coordinates and intensity is denoted by
the matrix $G = [g_{ij}]$ $(N_1 \times N_2)$. $F$ is composed of $M$ different regions
(which can occur an arbitrary number of times each), and through the use of different
sensors, $K$ distinct images $G^k$, $k = 1 \ldots K$, of the same scene can be obtained.

Also, assume that each element $g^k_{ij}$ is actually the sum of $b^k_{ij}$ and
$n^k_{ij}$, with pixel $(i,j)$ being in region $m$ through observation $k$:

$$g^k_{ij} = b^k_{ij} + n^k_{ij} \tag{1}$$

where $\{b^k_{ij}\}$ and $\{n^k_{ij}\}$ are stochastic fields characterizing the
underlying scene and the observation noise, respectively, in the data set $k$. A
further simplifying assumption is made: each region type $m$ in each data set $k$ can
be characterized by a constant intensity, $r^k_m$, which is the mean of that region
(i.e., $b^k_{ij} = r^k_m$), if the pixel $(i,j)$ is in region type $m$. Furthermore,
the additive noise field $n^k_{ij}$ is assumed to be spatially uncorrelated and
Gaussian so that the vector of the observation noise

$$n_{ij} = [n^1_{ij}, n^2_{ij}, \ldots, n^K_{ij}]^T \tag{2}$$

is a multivariate normal with mean zero and covariance matrix $C_m$ in region $m$. This
implies that the observation vector

$$g_{ij} = [g^1_{ij}, g^2_{ij}, \ldots, g^K_{ij}]^T \tag{3}$$

is a multivariate normal with mean $r_m$

$$r_m = [r^1_m, r^2_m, \ldots, r^K_m]^T \tag{4}$$

and covariance $C_m$ if pixel $(i,j)$ is in region type $m$.

The segmentation problem can be stated as mapping $G^k$ into a matrix $B^k$ from an
estimate of the sets $S = \{S_m\}$, $m = 1 \ldots M$, where

$$S_m = \{(i,j) : b^k_{ij} = r^k_m\} \tag{5}$$

$$B^k = [b^k_{ij}]\ (N_1 \times N_2) : b^k_{ij} \in [1 \ldots M] \text{ and } b^k_{ij} = m \text{ iff } (i,j) \in S_m \tag{6}$$

meaning $B^k$ is an $M$-level image matrix, where $b^k_{ij} = m$ if $S_m$ contains
pixel $(i,j)$.

Using the classical maximum likelihood segmentation method, define:

$$p_m(x) = (2\pi)^{-K/2}\, |C_m|^{-1/2} \exp\left[-\tfrac{1}{2}(x - r_m)^T C_m^{-1} (x - r_m)\right] \tag{7}$$

The segmentation procedure will assign pixel $(i,j)$ to region set $S_m$ if

$$p_m(g_{ij}) \ge p_l(g_{ij}) \text{ for all } 1 \le l \le M \tag{8}$$

This method only works when the signal-to-noise ratio $s/n$ is [illegible in source].
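
The maximum likelihood rule of equations 7 and 8 can be sketched as follows. To keep the example free of matrix inversion, it assumes a diagonal covariance matrix (independent sensors), which is a simplification not made in the text; regions are indexed from 0 rather than 1, and the means and variances are made-up numbers.

```python
import math


def log_density(x, mean, var):
    """Log of the Gaussian density of equation 7, assuming a diagonal covariance."""
    return -0.5 * sum(
        (xk - rk) ** 2 / vk + math.log(2.0 * math.pi * vk)
        for xk, rk, vk in zip(x, mean, var)
    )


def classify(x, means, variances):
    """Return the region index m that maximizes p_m(x), as in equation 8."""
    scores = [log_density(x, r, v) for r, v in zip(means, variances)]
    return max(range(len(scores)), key=scores.__getitem__)


# Two region types observed through K = 2 sensors (hypothetical values).
region_means = [[50.0, 60.0], [200.0, 180.0]]
region_vars = [[100.0, 100.0], [100.0, 100.0]]
dark_pixel = classify([60.0, 70.0], region_means, region_vars)
bright_pixel = classify([190.0, 170.0], region_means, region_vars)
```

Each pixel is classified on its own measurement only, which is why the rule degrades when noise moves a measurement closer to the wrong region mean.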

To develop a more robust procedure, it is necessary to bring other constraints into
the model. In this case, it is assumed that solid objects will appear in connected
blobs or subsets. At the pixel level, this would imply that for $(i,j)$ to belong to
region $m$

$$b_{ij} = m \Rightarrow \{\exists\, b_{pq} : b_{pq} = m,\ (p,q) = (i+\epsilon, j+\epsilon'),\ \epsilon, \epsilon' \in \{-1, 0, 1\},\ (p,q) \neq (i,j)\} \tag{9}$$

This can be modeled by a Markov field with 8-nearest neighbors defining the process
support. Assuming this limited support, the Markov process can be characterized by the
transition probabilities

$$p(b^k_{ij} = r^k_m,\ 1 \le k \le K \mid b^k_{rs},\ 1 \le r \le N_1,\ 1 \le s \le N_2,\ (r,s) \neq (i,j),\ 1 \le k \le K) = p(b^k_{ij} = r^k_m,\ 1 \le k \le K \mid b^k_{rs},\ (r,s) \in \Gamma_{ij},\ 1 \le k \le K) = \text{[illegible]} \tag{10}$$

where $\Gamma_{ij}$ is the local neighborhood of pixel $(i,j)$ as in equation 9. The
segmentation problem can now be formulated as a maximum a posteriori probability (MAP)
estimation problem. In particular, let $l(\cdot)$ represent a log-likelihood function.
We would then like to find the optimal $S^k$ which maximizes the conditional likelihood

$$l(S^k \mid G) = l(G \mid S^k) + l(S^k) - l(G) \tag{11}$$
or, since $l(G)$ is independent of $S^k$, more simply we can maximize

$$l(S^k \mid G) = l(G \mid S^k) + l(S^k) \tag{12}$$

In this case,

$$l(G \mid S^k) = \sum_{m=1}^{M} \sum_{(i,j) \in S_m} \ln p_m(g_{ij}) \tag{13}$$

and

$$l(S^k) = \sum_{m=1}^{M} \sum_{(i,j) \in S_m} \ln p(b^k_{ij} = r^k_m \mid b^k_{rs},\ (r,s) \in \Gamma_{ij}) \tag{14}$$

where $p_m$ and the transition probabilities are defined in equations 7 and 10,
respectively.


The solution to this equation can be found in parallel, by applying these constraints
to all the pixels of the image, which have been initialized with the brightness value
of each point. The relaxation process that follows decreases the overall energy of the
system by modifying each individual pixel classification, or the sets $S = \{S_m\}$,
$m = 1 \ldots M$ (refs 13 and 14). Similar equations have also been the motivation for
optimization in such relaxation lattices (refs 15 through 18).
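
A minimal sketch of such a relaxation is given below for a single sensor (K = 1). Several choices are assumptions made for illustration: the neighborhood term is a simple count of agreeing 8-neighbors weighted by a hypothetical `beta` (the exact transition probabilities are not recoverable from this report), all regions share one noise level `sigma`, and pixels are visited one after another rather than truly in parallel so that each change can only lower the energy.

```python
def data_cost(g, mean, sigma):
    """Negative Gaussian log-likelihood of brightness g, up to a constant."""
    return (g - mean) ** 2 / (2.0 * sigma ** 2)


def ml_labels(image, means, sigma):
    """Initial guess: label every pixel by its brightness alone."""
    return [
        [min(range(len(means)), key=lambda m: data_cost(g, means[m], sigma))
         for g in row]
        for row in image
    ]


def map_relax(image, means, sigma, beta=1.0, max_sweeps=20):
    """Lower the energy pixel by pixel until no label changes."""
    rows, cols = len(image), len(image[0])
    labels = ml_labels(image, means, sigma)
    for _ in range(max_sweeps):
        changed = False
        for i in range(rows):
            for j in range(cols):
                neighbors = [
                    labels[r][c]
                    for r in range(max(i - 1, 0), min(i + 2, rows))
                    for c in range(max(j - 1, 0), min(j + 2, cols))
                    if (r, c) != (i, j)
                ]

                def energy(m):
                    # Data term minus a reward for each agreeing neighbor.
                    return data_cost(image[i][j], means[m], sigma) - beta * neighbors.count(m)

                best = min(range(len(means)), key=energy)
                if best != labels[i][j]:
                    labels[i][j] = best
                    changed = True
        if not changed:
            break
    return labels


# A bright 4 x 4 object on a dark background, with one glare-darkened pixel inside.
scene = [[50.0] * 6 for _ in range(6)]
for i in range(1, 5):
    for j in range(1, 5):
        scene[i][j] = 200.0
scene[2][2] = 110.0
first_guess = ml_labels(scene, [50.0, 200.0], 40.0)
segmented = map_relax(scene, [50.0, 200.0], 40.0)
```

The darkened pixel is closer in brightness to the background mean, so the brightness-only guess mislabels it; the neighborhood term then pulls it back into the object.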


An airplane image (512 x 480), subject to glare, noise, and a nonuniform background,
is shown in figure 5. Using simply the constraints above, the parallel implementation of
the segmentation algorithm relaxes to the interpretation given in figure 6. Notice that
only the first kind of resonance is at work, and that the constraints are extremely simple.
More  sophisticated  schemes  are  capable  of  handling  textures  during  the  segmentation 
process  (ref  19). 


FEATURE EXTRACTION WITH CONNECTIONIST MODELS

There  has  been  a  considerable  amount  of  work  done  on  shape  encoding  and 
recognition  using  moments  and  Fourier  descriptors  (refs  20  through  28).  The  decision 
to use such models was based on the fact that this procedure is simple and well
understood, and can be made reversible.

In  this  system,  the  feature  extractor  module  receives  a  portion  of  the  image  from  the 
segmentor.  This  image  portion  contains  only  the  labels  of  the  pixels  and  not  the  pixel 
values  themselves  (fig.  4).  The  module  then  applies  a  transformation  to  the  image  and 
produces  at  the  output  the  important  features  to  describe  the  object.  The  choice  of  the 
set of transformations and features can be made arbitrarily complex, but for the sake of
simplicity,  only  the  Fourier  descriptor  method  is  discussed. 

Fourier Descriptor Method

There are two ways of designing a highly parallel, feature-based module. (1) The
more classical methods of extracting parallelism from a given algorithm can be applied.
The dataflow graph of the algorithm is created, a careful investigation of the
application of various transformations is made, and the resulting dataflow graphs are
compared for parallelism and speedup (i.e., increased breadth and reduced depth of the
graph). After the final transformation, the algorithm is hardwired into the computer
architecture. (2) A less traditional and perhaps more interesting approach is to take
advantage of the adaptive characteristic of connectionist models. There, we start with
the parallel model, and using a learning algorithm, adapt the system until the cohort
of elements produces the desired transform. This model sometimes leads to
approximations of the original transforms due to the need for limiting the number of
processors used. Besides the fact that this leads to [illegible] approximations in most
cases (ref 29), our application can make good use of this adaptive method since, in a
practical implementation, even the successful traditional transform designs would also
be subject to approximation and truncation (ref 28).


With this adaptive method, an inverse transform module, which would feed back the
correct pixel classification to the segmentation unit after recognition, can be
designed. It is important to be able to reconstruct the original image after
recognition for a variety of reasons. Perhaps the most important one is the ability to
deal with multiple objects and occlusions. This works as follows: Assume that the
segmentation algorithm can separate the image into several different sets of pixels
according to some homogeneity criteria. The identification system would be most
sensitive to the object which is the largest, has the least amount of occlusion, is
the least noisy, or is the least ambiguous. After
the  identification  is  correctly  done,  the  system  can  send  an  inhibitory  signal  to  all  the 
pixels  that  participate  in  that  object,  and  it  would  be  free  to  pick  up  the  next  prominent 
available  object  even  if  it  was  partially  occluded  by  the  first.  In  this  global  serial  mode, 
the  system  proceeds  to  identify  each  class  of  pixels,  one  at  a  time,  through  the  same 
resonating  scheme. 

Fourier and moments methods have been shown to deal extremely well with object
recognition at varying 3-D perspective, producing not only the recognition of the
object but also its orientation (ref 28). In the same reference, the two methods are
compared with other methods, such as range moments. Fourier and moments methods can be
made invariant to position, rotation, and scaling by adding the proper modifications.
Although this property is valuable during the classification process, a quick
reconstruction of the image requires preserving information about these varying
attributes. For example, a position invariant transformation generates features that,
when inverted, can have several interpretations in the image. To solve this problem,
the value of the centroid of the region and the angle of its main axis can be retained
among the features. These features would aid the inverse transformation in the pixel
inhibiting operation.
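
As an illustration of this trade-off, the sketch below computes Fourier descriptors of a closed boundary whose points are written as complex numbers x + iy. The normalization is one common choice and is an assumption here, not necessarily the one used in this work: the zeroth coefficient is dropped (position), the remaining magnitudes are divided by the magnitude of the first coefficient (scale), and phases are discarded (rotation and starting point). The full complex coefficients, including the mean of the boundary points as a stand-in for the region centroid, are kept separately so that the boundary can be rebuilt.

```python
import cmath


def dft(points):
    """Discrete Fourier coefficients of a closed boundary given as complex points."""
    n = len(points)
    return [
        sum(z * cmath.exp(-2j * cmath.pi * k * t / n) for t, z in enumerate(points)) / n
        for k in range(n)
    ]


def idft(coeffs):
    """Rebuild the boundary points from the full set of complex coefficients."""
    n = len(coeffs)
    return [
        sum(c * cmath.exp(2j * cmath.pi * k * t / n) for k, c in enumerate(coeffs))
        for t in range(n)
    ]


def describe(points):
    """Return (invariant features, centroid of boundary points, full coefficients)."""
    coeffs = dft(points)
    centroid = coeffs[0]                  # mean of the boundary points
    scale = abs(coeffs[1])                # size of the dominant harmonic
    invariant = [abs(c) / scale for c in coeffs[1:]]
    return invariant, centroid, coeffs


# A square boundary, then the same square scaled by 3, rotated 90 degrees, and shifted.
square = [0 + 0j, 1 + 0j, 2 + 0j, 2 + 1j, 2 + 2j, 1 + 2j, 0 + 2j, 0 + 1j]
moved = [3j * z + (10 + 5j) for z in square]
features_a, centroid_a, coeffs_a = describe(square)
features_b, centroid_b, _ = describe(moved)
rebuilt = idft(coeffs_a)
```

The invariant features of the two squares agree, which is what the classifier needs, while the retained centroid and complex coefficients are what allow the inverse transform to place the model back in the image.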

Adaptive  Methods  for  Feature  Extraction 

If  it  is  desired  to  design  even  more  sophisticated  models  of  feature  extraction  and 
use  the  features  to  describe  object  identification,  orientation,  and  absolute  and  relative 
positions, the adaptive qualities of the connectionist models can be used again. One of
the classes of adaptive algorithms is called supervised learning. The most popular of
such algorithms for highly parallel networks is called the Generalized Delta Rule
(ref 4). The user provides the network with a careful choice of examples accompanied by
the desired response of the system. The network proceeds to alter its parameters until
the desired transfer function can be emulated. Mathematical studies of such a model
have shown that networks of unlimited size can approximate an arbitrary function
arbitrarily closely.
These  issues  are  beyond  the  scope  of  this  work. 
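
For orientation only, the supervised procedure can be sketched on the smallest possible case: a single sigmoid unit trained with the delta rule. The Generalized Delta Rule extends the same error term backward through hidden layers, which this sketch omits. The learning rate, the number of epochs, and the choice of the logical AND as the desired response are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def respond(weights, bias, inputs):
    """Output of one sigmoid unit."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_unit(samples, rate=1.0, epochs=5000):
    """Adjust the weights after each example in proportion to the output error."""
    weights, bias = [0.0] * len(samples[0][0]), 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            out = respond(weights, bias, inputs)
            delta = (target - out) * out * (1.0 - out)  # error times sigmoid slope
            weights = [w + rate * delta * x for w, x in zip(weights, inputs)]
            bias += rate * delta
    return weights, bias


# Examples paired with the desired response (logical AND of two inputs).
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
unit_weights, unit_bias = train_unit(examples)
decisions = [respond(unit_weights, unit_bias, x) > 0.5 for x, _ in examples]
```

After training, thresholding the unit's output at 0.5 reproduces the desired response for all four examples.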

BIDIRECTIONAL  ASSOCIATIVE  MEMORIES 

The  highest  module  in  this  identification  system  is  the  bidirectional  associative 
memory.  It  doubles  as  a  nearest  neighbor  classifier  and  a  library  look  up,  but  because 
of its parallel structure, can do both very efficiently. There are two characteristics in
the design of such a memory that appeal to this problem: (1) it can deal with
noise-corrupted features very well, even in the presence of hardware damage, and (2) it uses a
resonating  structure  (heteroassociative)  which  recalls  related  information  to  the  input, 
uses  this  new  information  to  correct  the  input,  and  then  reuses  the  new  input  for  recall. 
This  process  continues  until  a  perfect  match  is  found. 

There are several ways that this type of memory could be constructed. The first to
suggest a possible design was Bart Kosko (ref 6). This design displays all the
desirable characteristics discussed above. One major drawback that has driven the
research in this area is the storage capacity of such memories (ref 30). The number of
storable patterns is less than the sum of the lengths of the input and output feature
vectors. Beyond this number, the memory tends to generalize and fuse patterns together
and sometimes generates spurious memories, which are combinations of other recalls.

Basically, Kosko's memory works through the following computation: Let A be the
input feature vector, and B be the associated output vector. When A is received, a
multiplication of A by a matrix M is performed to generate B. M represents the long
term storage memory of the system. Once B is obtained, a multiplication of B by the
transpose of M is performed to generate A. If A is corrupted with noise, then the
recalled version of B would also be corrupted. This bidirectional process continues
until the system reproduces the same A and B on successive passes (stable memory
points). Notice that a more daring design could couple several of these modules,
forming a closed circuit where each module would perform one type of association. This
could have a significant impact on the performance of the pattern recognition system.
When a good discrimination cannot be done in any of the individual feature spaces, it
might be possible through evidence accumulation over several spaces.
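
The recall loop can be sketched as follows, assuming bipolar (+1/-1) vectors, a storage matrix built as the sum of outer products of the stored pairs, and a threshold that keeps the previous value when a sum is exactly zero. These are the usual conventions for this kind of memory, stated here as assumptions; the stored patterns are made up.

```python
def build_memory(pairs):
    """Long term storage M: sum of outer products of the stored (A, B) pairs."""
    n, p = len(pairs[0][0]), len(pairs[0][1])
    return [[sum(a[i] * b[j] for a, b in pairs) for j in range(p)] for i in range(n)]


def threshold(sums, previous):
    """Bipolar threshold; a zero sum keeps the previous state."""
    return [1 if s > 0 else -1 if s < 0 else old for s, old in zip(sums, previous)]


def recall(a, m, max_passes=20):
    """Bounce between A and B through M and its transpose until both are stable."""
    n, p = len(m), len(m[0])
    b = [1] * p  # arbitrary starting state for the B side
    for _ in range(max_passes):
        new_b = threshold([sum(a[i] * m[i][j] for i in range(n)) for j in range(p)], b)
        new_a = threshold([sum(new_b[j] * m[i][j] for j in range(p)) for i in range(n)], a)
        if new_a == a and new_b == b:
            break  # stable memory point
        a, b = new_a, new_b
    return a, b


A1, B1 = [1, -1, 1, -1, 1, -1], [1, 1, -1, -1]
A2, B2 = [1, 1, 1, -1, -1, -1], [1, -1, 1, -1]
M = build_memory([(A1, B1), (A2, B2)])
noisy_A1 = [-1, -1, 1, -1, 1, -1]  # A1 with its first element flipped
clean_A, recalled_B = recall(noisy_A1, M)
```

Starting from the corrupted input, the first forward pass already recalls B1, the backward pass restores the clean A1, and the next pass confirms that the pair is stable.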


CONCLUSIONS 

The implications of a parallel connectionist-like design for vision systems have been
discussed, in particular the design of a simple object recognition system based on
resonant principles. This design should allow us to tackle problems of noise, occlusion,
and ambiguity in a fresh new light, making heavy use of tightly coupled continuous
feedback systems for vision. The implementation of each module is parallel and simple.
The strength of the design is derived from the combination of modules and the resonant
structures.

Some research questions, with great impact on the engineering of such systems,
still remain to be answered: (1) To what degree and under what circumstances is this
system stable, given all the feedback loops? (2) How can bidirectional memories be
designed with very large storage capacity to address the vision problem? Nevertheless,
the  preliminary  results  are  promising,  and  the  possibilities  for  designing  vision  systems 
using  this  technology  indicate  that  it  might  soon  challenge  classical  systems  in  speed, 
fault  tolerance,  and  performance. 